Goto

Collaborating Authors

 junction tree


Collaborative and Confidential Junction Trees for Hybrid Bayesian Networks

Neural Information Processing Systems

Bayesian Network models are a powerful tool to collaboratively optimize production processes in various manufacturing industries. When interacting, collaborating parties must preserve their business secrets by maintaining the confidentiality of their model structures and parameters. While most realistic industry scenarios involve hybrid settings, handling both discrete and continuous data, current state-ofthe-art methods for collaborative and confidential inference only support discrete data and have high communication costs. In a centralized setting, Junction Trees enable efficient inference even in hybrid scenarios without discretizing continuous variables, but no extension for collaborative and confidential scenarios exists. To address this research gap, we introduce Hybrid CCJT, the first framework for confidential multiparty inference in hybrid domains with semi-honest, non-colluding adversaries, comprising: (i) a method to construct a strongly-rooted Junction Tree across collaborating parties through a novel construct of interface cliques; and, (ii) a protocol for confidential inference built upon multiparty computation primitives comprising a one-time alignment phase and a belief propagation system for combining the inference results across the Junction Tree cliques. Extensive evaluation on nine datasets shows that Hybrid CCJT improves the predictive accuracy of continuous target variables by 32% on average compared to the state-of-the-art, while reducing communication costs by a median 10.4 under purely discrete scenarios.


XIMP: Cross Graph Inter-Message Passing for Molecular Property Prediction

arXiv.org Machine Learning

Accurate molecular property prediction is central to drug discovery, yet graph neural networks often underperform in data-scarce regimes and fail to surpass traditional fingerprints. We introduce cross-graph inter-message passing (XIMP), which performs message passing both within and across multiple related graph representations. For small molecules, we combine the molecular graph with scaffold-aware junction trees and pharmacophore-encoding extended reduced graphs, integrating complementary abstractions. While prior work is either limited to a single abstraction or non-iterative communication across graphs, XIMP supports an arbitrary number of abstractions and both direct and indirect communication between them in each layer. Across ten diverse molecular property prediction tasks, XIMP outperforms state-of-the-art baselines in most cases, leveraging interpretable abstractions as an inductive bias that guides learning toward established chemical concepts, enhancing generalization in low-data settings.


Leveraging Discrete Function Decomposability for Scientific Design

arXiv.org Artificial Intelligence

In the era of AI-driven science and engineering, we often want to design discrete objects in silico according to user-specified properties. For example, we may wish to design a protein to bind its target, arrange components within a circuit to minimize latency, or find materials with certain properties. Given a property predictive model, in silico design typically involves training a generative model over the design space (e.g., protein sequence space) to concentrate on designs with the desired properties. Distributional optimization -- which can be formalized as an estimation of distribution algorithm or as reinforcement learning policy optimization -- finds the generative model that maximizes an objective function in expectation. Optimizing a distribution over discrete-valued designs is in general challenging because of the combinatorial nature of the design space. However, many property predictors in scientific applications are decomposable in the sense that they can be factorized over design variables in a way that could in principle enable more effective optimization. For example, amino acids at a catalytic site of a protein may only loosely interact with amino acids of the rest of the protein to achieve maximal catalytic activity. Current distributional optimization algorithms are unable to make use of such decomposability structure. Herein, we propose and demonstrate use of a new distributional optimization algorithm, Decomposition-Aware Distributional Optimization (DADO), that can leverage any decomposability defined by a junction tree on the design variables, to make optimization more efficient. At its core, DADO employs a soft-factorized "search distribution" -- a learned generative model -- for efficient navigation of the search space, invoking graph message-passing to coordinate optimization across linked factors.


JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation

arXiv.org Artificial Intelligence

The discovery of new molecules based on the original chemical molecule distributions is of great importance in medicine. The graph transformer, with its advantages of high performance and scalability compared to traditional graph networks, has been widely explored in recent research for applications of graph structures. However, current transformer-based graph decoders struggle to effectively utilize graph information, which limits their capacity to leverage only sequences of nodes rather than the complex topological structures of molecule graphs. This paper focuses on building a graph transformer-based framework for molecular generation, which we call \textbf{JTreeformer} as it transforms graph generation into junction tree generation. It combines GCN parallel with multi-head attention as the encoder. It integrates a directed acyclic GCN into a graph-based Transformer to serve as a decoder, which can iteratively synthesize the entire molecule by leveraging information from the partially constructed molecular structure at each step. In addition, a diffusion model is inserted in the latent space generated by the encoder, to enhance the efficiency and effectiveness of sampling further. The empirical results demonstrate that our novel framework outperforms existing molecule generation methods, thus offering a promising tool to advance drug discovery (https://anonymous.4open.science/r/JTreeformer-C74C).


Mode Estimation for High Dimensional Discrete Tree Graphical Models

Neural Information Processing Systems

This paper studies the following problem: given samples from a high dimensional discrete distribution, we want to estimate the leading (δ, ρ)-modes of the underlying distributions. A point is defined to be a (δ, ρ)-mode if it is a local optimum of the density within a δ-neighborhood under metric ρ. As we increase the "scale" parameter δ, the neighborhood size increases and the total number of modes monotonically decreases. The sequence of the (δ, ρ)-modes reveal intrinsic topographical information of the underlying distributions. Though the mode finding problem is generally intractable in high dimensions, this paper unveils that, if the distribution can be approximated well by a tree graphical model, mode characterization is significantly easier. An efficient algorithm with provable theoretical guarantees is proposed and is applied to applications like data analysis and multiple predictions.



Review for NeurIPS paper: Factor Graph Grammars

Neural Information Processing Systems

Clarity: The paper is fairly dense because of the unfortunate 8-page limit, but well and carefully written. I think the most confusing part for readers ls likely to be the conjunction operation -- if there's an extra page in the camera-ready, the presentation here should be slowed down with some qualitative discussion. You should probably clarify early on that you're talking about undirected hypergraphs. Notation in section 2.1: I regard 52-53 as a commutation property, basically vertices(\bar{e}) \bar{vertices(e)}, where \bar lifts from variables or variable-tuples to their labels. I don't understand where the name "att" comes from ("attachment"?) or why you use the name "type" in the way you do.


SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation

arXiv.org Artificial Intelligence

We introduce SeaDAG, a semi-autoregressive diffusion model for conditional generation of Directed Acyclic Graphs (DAGs). Considering their inherent layer-wise structure, we simulate layer-wise autoregressive generation by designing different denoising speed for different layers. Unlike conventional autoregressive generation that lacks a global graph structure view, our method maintains a complete graph structure at each diffusion step, enabling operations such as property control that require the full graph structure. Leveraging this capability, we evaluate the DAG properties during training by employing a graph property decoder. We explicitly train the model to learn graph conditioning with a condition loss, which enhances the diffusion model's capacity to generate graphs that are both realistic and aligned with specified properties. We evaluate our method on two representative conditional DAG generation tasks: (1) circuit generation from truth tables, where precise DAG structures are crucial for realizing circuit functionality, and (2) molecule generation based on quantum properties. Our approach demonstrates promising results, generating high-quality and realistic DAGs that closely align with given conditions.


Generalized Naive Bayes

arXiv.org Machine Learning

In this paper we introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure. We give a new greedy algorithm that finds a good fitting Generalized Naive Bayes (GNB) probability distribution. We prove that this fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB). Then, under a not very restrictive condition, we give a second algorithm for which we can prove that it finds the optimal GNB probability distribution, i.e. best fitting structure in the sense of KL divergence. Both algorithms are constructed to maximize the information content and aim to minimize redundancy. Based on these algorithms, new methods for feature selection are introduced. We discuss the similarities and differences to other related algorithms in terms of structure, methodology, and complexity. Experimental results show, that the algorithms introduced outperform the related algorithms in many cases.


The diameter of a stochastic matrix: A new measure for sensitivity analysis in Bayesian networks

arXiv.org Artificial Intelligence

Their use as a decision support tool in business and OR has been increasing over the years, including case studies in project management (van Dorp, 2020), supply chain (Garvey et al., 2015), marketing (Hosseini, 2021), and logistics (Qazi, 2022), among others. BNs are defined by two components: a directed acyclic graph (DAG) where each node is a variable of interest and edges represent the, possibly causal, relationship between them; a conditional probability table (CPT) for each node of the DAG reporting the probability distribution of the associated variable conditional on its parents. BNs are highly interpretable due to their graphical nature, representing the probabilistic relationships between variables, making it easy for users to understand and trace the influence of one variable on another. With explainability now recognized as critical for the use of AI in applied research (Rudin, 2019), including in OR (De Bock et al., 2023), BNs stand out by providing transparent and intuitive explanations, thereby enhancing trust and clarity in decision-making processes. The underlying DAG and the associated CPTs can be learned from data using machine learning algorithms or elicited using experts' opinions and knowledge. There is now a vast amount of algorithms to learn BN from data (e.g.